Background

Acute graft-versus-host disease (aGVHD) is one of the critical complications following allogeneic hematopoietic stem cell transplantation (HSCT). Thus far, various prediction scores have been developed using conventional statistical methods, such as multivariate analyses. Recent progress in machine learning algorithms, which form part of the data mining approach, suggests that this technique could be applied to establish a novel GVHD risk prediction index based on pre-HSCT parameters. The primary objective of this study was to establish and validate such an index for aGVHD (grades 2-4 and 3-4).

Methods

This was a registry-based retrospective cohort study analyzing data on adult HSCT recipients obtained from the registry of the Japanese Society for Hematopoietic Cell Transplantation. Pre-HSCT parameters, such as those relating to patients, donors, conditioning regimens, and other procedures, were retrieved from the database and introduced into the data mining approach. The cohort was randomly divided into a training cohort (70% of the entire dataset) and a validation cohort (the remaining 30%). The alternating decision tree (ADTree) machine learning algorithm was applied to develop a model: the algorithm was trained and tested using 10-fold cross validation on the training cohort, and the resulting ADTree was then validated in the validation cohort using a competing risk hazard model.
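To make the scoring idea concrete, the following is a minimal sketch of how an alternating decision tree turns a patient record into a single score: every rule whose precondition holds contributes one of two prediction values, and the score is the sum of those contributions plus a root value. The variable names (`hla_mismatch`, `donor_age`, `female_to_male`), the thresholds, and all numeric weights are hypothetical and chosen only for illustration; they are not the rules or weights of the models described here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Patient = Dict[str, object]


@dataclass
class Rule:
    """One ADTree rule: a precondition, a test, and two prediction values."""
    precondition: Callable[[Patient], bool]
    condition: Callable[[Patient], bool]
    value_if_true: float
    value_if_false: float


def adtree_score(patient: Patient, root_value: float, rules: List[Rule]) -> float:
    """Sum the root value and the prediction value of every reachable rule."""
    score = root_value
    for rule in rules:
        # A rule only contributes when its precondition (its path) is satisfied.
        if rule.precondition(patient):
            if rule.condition(patient):
                score += rule.value_if_true
            else:
                score += rule.value_if_false
    return score


def always(_patient: Patient) -> bool:
    return True


# Hypothetical rules and weights, for illustration only.
ROOT_VALUE = -0.5
RULES = [
    Rule(always, lambda p: bool(p["hla_mismatch"]), 0.5, -0.25),
    # Nested rule: only evaluated for HLA-mismatched transplants.
    Rule(lambda p: bool(p["hla_mismatch"]), lambda p: p["donor_age"] >= 40, 0.25, -0.125),
    Rule(always, lambda p: bool(p["female_to_male"]), 0.125, 0.0),
]

example_patient = {"hla_mismatch": True, "donor_age": 45, "female_to_male": False}
example_score = adtree_score(example_patient, ROOT_VALUE, RULES)
print(example_score)
```

Because the score is a plain sum over reachable rules, a higher total corresponds to a higher predicted risk, and the continuous score can afterwards be cut into ordered risk categories.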

Results

In total, 26,695 patients who received transplants from allogeneic donors from 1992 to 2016 were included in this study. More than half of the patients were treated for acute myeloid leukemia or myelodysplastic syndrome (50.9%), followed by acute lymphoblastic leukemia (19.2%) and non-Hodgkin lymphoma (8.3%). The cumulative incidence of grades 2-4 and 3-4 aGVHD was 42.8% (95% confidence interval [CI], 42.2 - 43.4%) and 17.1% (95% CI, 16.6 - 17.5%), respectively.
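As a rough plausibility check on the width of these intervals, the sketch below computes a simple normal-approximation (Wald) interval for a proportion with the cohort size given above. This is an assumption made for illustration only: the abstract does not state how its intervals were obtained, and a cumulative incidence estimate that handles censoring and competing events will not match a plain binomial interval exactly.

```python
import math
from typing import Tuple


def wald_interval(proportion: float, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Normal-approximation confidence interval for a binomial proportion."""
    standard_error = math.sqrt(proportion * (1.0 - proportion) / n)
    return proportion - z * standard_error, proportion + z * standard_error


N_PATIENTS = 26695  # cohort size reported in the abstract

grade_2_4 = wald_interval(0.428, N_PATIENTS)
grade_3_4 = wald_interval(0.171, N_PATIENTS)

for label, (lower, upper) in [("grades 2-4", grade_2_4), ("grades 3-4", grade_3_4)]:
    print(f"{label}: {100 * lower:.1f}% to {100 * upper:.1f}%")
```

With a cohort of this size, the approximation gives intervals roughly one percentage point wide, which is in line with the narrow intervals reported.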

Predictive ADTree models were established using the training cohort (N = 17,244). Out of >30 variables considered, 15 variables, such as underlying disease, donor source, HLA and sex mismatch, conditioning regimen, GVHD prophylaxis, and donor age, were adopted into each model for aGVHD prediction (Figure 1). Cross validation demonstrated that the models' discrimination for the incidence of aGVHD was appropriate (area under the curve: 0.616 and 0.623 for grades 2-4 and 3-4, respectively). These models were tested in the validation cohort (N = 8,050), and the incidence of aGVHD was clearly stratified according to the categorized ADTree scores (Figure 2). The cumulative incidence of grades 2-4 aGVHD was 29.0% for low risk, 35.3% for low-intermediate risk (hazard ratio [HR] compared with low risk, 1.26; 95% CI, 1.11 - 1.42), 41.8% for intermediate risk (HR, 1.56; 95% CI, 1.39 - 1.76), 48.7% for high-intermediate risk (HR, 2.00; 95% CI, 1.78 - 2.24), and 58.7% for high risk (HR, 2.57; 95% CI, 2.30 - 2.88). The cumulative incidence of grades 3-4 aGVHD was 8.6% for low risk, 12.5% for low-intermediate risk (HR, 1.53; 95% CI, 1.20 - 1.93), 14.9% for intermediate risk (HR, 1.85; 95% CI, 1.48 - 2.31), 21.1% for high-intermediate risk (HR, 2.72; 95% CI, 2.19 - 3.38), and 28.6% for high risk (HR, 3.87; 95% CI, 3.13 - 4.78). These two aGVHD scores were also associated with inferior overall survival after HSCT.
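For readers less familiar with the discrimination measure quoted above, the area under the curve can be read as the probability that a randomly chosen patient who developed aGVHD received a higher score than a randomly chosen patient who did not (0.5 corresponds to chance, 1.0 to perfect ranking). The sketch below computes it by direct pairwise comparison. It treats the outcome as a simple binary label and ignores censoring and competing events, which is a simplifying assumption; the abstract does not describe how its values were computed. The toy scores and outcomes are made up.

```python
from typing import Sequence


def pairwise_auc(scores: Sequence[float], events: Sequence[int]) -> float:
    """AUC as the fraction of (event, non-event) pairs ranked correctly.

    Ties between an event and a non-event count as half a correct pair.
    """
    with_event = [s for s, e in zip(scores, events) if e == 1]
    without_event = [s for s, e in zip(scores, events) if e == 0]
    if not with_event or not without_event:
        raise ValueError("need at least one patient with and one without the event")

    correct = 0.0
    for s_event in with_event:
        for s_none in without_event:
            if s_event > s_none:
                correct += 1.0
            elif s_event == s_none:
                correct += 0.5
    return correct / (len(with_event) * len(without_event))


# Made-up scores and outcomes (1 = developed aGVHD, 0 = did not).
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_events = [0, 0, 1, 1]
toy_auc = pairwise_auc(toy_scores, toy_events)
print(toy_auc)
```

The pairwise form is quadratic in the number of patients, so for a cohort of this size a rank-based implementation would be preferable; the version here is kept explicit for readability.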

Discussion

Variables automatically extracted through a machine learning algorithm (i.e., ADTree), in the absence of any bias from researchers, were clinically reasonable, and the resulting models provided robust risk stratification scores for the incidence of aGVHD following allogeneic HSCT. The high reproducibility and freedom from interactions among the variables indicate that ADTree, along with other data mining approaches, may be widely used to establish risk scores. The present results should be validated in other patient cohorts through future studies worldwide.

Disclosures

Ichinohe: Nippon Shinyaku Co.: Research Funding; Ono Pharmaceutical Co.: Research Funding; Otsuka Pharmaceutical Co.: Research Funding; Repertoire Genesis Inc.: Research Funding; MSD: Research Funding; Pfizer: Research Funding; JCR Pharmaceuticals: Honoraria; Celgene: Honoraria; Takeda Pharmaceutical Co.: Research Funding; Zenyaku Kogyo Co.: Research Funding; Sumitomo Dainippon Pharma Co.: Research Funding; Taiho Pharmaceutical Co.: Research Funding; Alexion Pharmaceuticals: Honoraria; Bristol-Myers Squibb: Honoraria; Janssen Pharmaceutical K.K.: Honoraria; Mundipharma: Honoraria; Novartis: Honoraria; Kyowa Hakko Kirin Co.: Research Funding; Eisai Co.: Research Funding; CSL Behring: Research Funding; Chugai Pharmaceutical Co.: Research Funding; Astellas Pharma: Research Funding. Kanda: Eisai: Consultancy, Honoraria, Research Funding; Otsuka: Research Funding; Asahi-Kasei: Research Funding; Pfizer: Research Funding; Taisho-Toyama: Research Funding; Shionogi: Consultancy, Honoraria, Research Funding; Nippon-Shinyaku: Research Funding; Tanabe-Mitsubishi: Research Funding; Sanofi: Research Funding; Takeda: Consultancy, Honoraria, Research Funding; Bristol-Myers Squibb: Consultancy, Honoraria; MSD: Research Funding; Ono: Consultancy, Honoraria, Research Funding; CSL Behring: Research Funding; Dainippon-Sumitomo: Consultancy, Honoraria, Research Funding; Taiho: Research Funding; Kyowa-Hakko Kirin: Consultancy, Honoraria, Research Funding; Chugai: Consultancy, Honoraria, Research Funding; Astellas: Consultancy, Honoraria, Research Funding; Novartis: Research Funding; Celgene: Consultancy, Honoraria; Mochida: Consultancy, Honoraria; Alexion: Consultancy, Honoraria; Takara-bio: Consultancy, Honoraria.

Author notes

* Asterisk with author names denotes non-ASH members.
